Chapter 2 


Memory Organization 


Memory Hierarchy 


Every computer contains several levels of
memory, which together are called the memory hierarchy.

The levels are main memory, cache, and auxiliary storage.









Main Memory 


The main memory is the central storage unit 
in a computer system. 

It is a relatively large and fast memory used 
to store programs and data during the 
computer operation. 

Most of the main memory in a general- 
purpose computer is made up of RAM 
integrated circuit chips, but a portion of the 
memory may be constructed with ROM 
chips. 

RAM and ROM chips are available in a 
variety of sizes. 


RAM Chips 

■ The CPU communicates with a RAM chip much faster
than it can with auxiliary devices directly.

■ If the memory needed for the computer is larger than
the capacity of one RAM chip, we combine a
number of chips to form the required memory size.

■ If we have many RAM chips, we must choose which
one of them to access.

■ We use control inputs to select a chip only when
it is needed.
[Figure: block diagram of a 128 × 8 RAM chip with an 8-bit data bus]

128 = 2^7 words, so the address is 7 bits.
8 bits = 1 byte of data per word.
RAM chip function table

CS1  CS2  RD  WR  Memory function  State of data bus
0    0    X   X   Inhibit          High-impedance
0    1    X   X   Inhibit          High-impedance
1    0    0   0   Inhibit          High-impedance
1    0    0   1   Write            Input data to RAM
1    0    1   X   Read             Output data from RAM
1    1    X   X   Inhibit          High-impedance
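The function table can be read as a small decision procedure. The sketch below is only an illustration of that table; the function name `ram_chip_function` and the returned strings are chosen here and do not come from the slides.

```python
def ram_chip_function(cs1, cs2, rd, wr):
    """Return (memory function, state of data bus) for one RAM chip."""
    # The chip responds only when CS1 = 1 and CS2 = 0.
    if not (cs1 == 1 and cs2 == 0):
        return ("Inhibit", "High-impedance")
    # When RD = 1 the WR input is a don't-care (X in the table).
    if rd == 1:
        return ("Read", "Output data from RAM")
    if wr == 1:
        return ("Write", "Input data to RAM")
    # Selected, but neither read nor write was requested.
    return ("Inhibit", "High-impedance")


for inputs in [(0, 0, 1, 1), (1, 0, 0, 1), (1, 0, 1, 0), (1, 1, 1, 0)]:
    print(inputs, ram_chip_function(*inputs))
```

Checking the chip-select inputs first mirrors the table: four of its six rows are decided by CS1 and CS2 alone.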


ROM Chips 

■ A ROM chip is organized externally in a similar
manner. Because a ROM can only be read, its data bus can
only be in output mode.

■ For the same-size chip, it is possible to have more 
bits of ROM than of RAM, because the internal 
binary cells in ROM occupy less space than in 
RAM . 

■ For this reason, the diagram specifies a 512-byte
ROM, while the RAM has only 128 bytes.

■ Address lines = 9 bits (512 = 2^9).

■ The two chip select inputs must be CS1 = 1 and
CS2 = 0 for the unit to operate.

■ Otherwise, the data bus is in a high-impedance 
state. 
Practical Example: How the CPU
Deals with RAM and ROM


Assume that a computer system has:

■ 512 bytes of RAM (we use 4 RAM chips
of the same type, 128 × 8)

■ 512 bytes of ROM (a single chip).


16-bit address bus

Bits 11-16 (6 bits): not used
Bit 10 (1 bit): selects RAM (0) or ROM (1) access
Bits 8-9 (2 bits): RAM chip select, or the rest of the ROM chip address
Bits 1-7 (7 bits): address within the chip
[Figure: a decoder driven by address bits 8-9, with its four outputs connected to CS1 of RAM 1, RAM 2, RAM 3, and RAM 4]


We use a decoder on bits 8-9 to select one of the 4
RAM chips by connecting each decoder
output to CS1 of one RAM chip.
RAM 1: (0000 0000 0000 0000)_2 = (0000)_16
       (0000 0000 0111 1111)_2 = (007F)_16

RAM 2: (0000 0000 1000 0000)_2 = (0080)_16
       (0000 0000 1111 1111)_2 = (00FF)_16

RAM 3: (0000 0001 0000 0000)_2 = (0100)_16
       (0000 0001 0111 1111)_2 = (017F)_16
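The address ranges above follow from the way the address bits are split. As a minimal sketch, assuming bit 10 selects ROM when it is 1, bits 8-9 feed the RAM decoder, and the unused upper bits are simply ignored, the decoding could be modelled as follows (the function name `decode_address` is hypothetical):

```python
def decode_address(address):
    """Map a 16-bit address to (device, offset inside that device)."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    rom_selected = (address >> 9) & 1           # bit 10: 0 = RAM, 1 = ROM
    if rom_selected:
        return ("ROM", address & 0x1FF)         # bits 1-9: 512-byte ROM address
    chip = (address >> 7) & 0b11                # bits 8-9: decoder input
    return (f"RAM {chip + 1}", address & 0x7F)  # bits 1-7: 128-byte RAM address


for a in (0x0000, 0x007F, 0x0080, 0x00FF, 0x0100, 0x017F):
    print(f"{a:04X}", decode_address(a))
```

Bits are numbered from 1 in the slides, so bit 10 is obtained by shifting right 9 places.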


Auxiliary Memory 

The most common auxiliary memory
devices used in computer systems are
magnetic disks, tapes, CDs, and DVDs.

The important characteristics of any device
are its access mode, access time, transfer
rate, capacity, and cost.

The access time consists of a seek time
required to position the read-write head at a
location and a transfer time required to
transfer data to or from the device.

Because the seek time is usually much 
longer than the transfer time, auxiliary 
storage is organized in records or blocks . 




■ A record is a specified number of 
characters or words . 

■ Reading or writing is always done on 
entire records. 


■ The transfer rate is the number of 
characters or words that the device 
can transfer per second, after it has 
been positioned at the beginning of 
the record. 


1- Magnetic Disks

■ A magnetic disk is a circular plate constructed of
metal or plastic coated with magnetized material.

■ Bits are stored in the magnetized surface in spots
along concentric circles called tracks.

■ The tracks are commonly divided into sections
called sectors, which usually have a fixed size (512 bytes is
a nearly universal sector size).

[Figure: disk surface showing tracks and sectors]




■ Heads: Some units use a single read/write 
head for each disk surface. 

■ In other disk systems, separate read/write 
heads are provided for each track in each 
surface. 

■ Disks that are permanently attached to the 
unit assembly and cannot be removed by 
the occasional user are called hard disks . 

■ A disk drive with removable disks is called
a floppy disk drive.

■ The disks used with a floppy disk drive are
small removable disks made of plastic
coated with magnetic recording material.




Read/write mechanisms:

■ Data are recorded and later retrieved via a
conducting coil called the head.

■ The disk surface is coated with magnetizable 
material. 

■ Some systems have two heads: read and 
write. 

■ During a read or write the head is stationary 
while the disk platter rotates under it. 



The write mechanism: 


■ Uses the idea that electricity flowing 
through a coil produces a magnetic 
field. 

■ Data pulses are sent to the write 
head, and magnetic patterns are 
recorded on the surface below, with 
different patterns for positive and 
negative currents. 

■ The 1 and 0 are stored as two 
different directions of magnetization, 
through sending currents in two 
different directions to the writing 
head. 



The Read mechanism: 


■ The idea of reading is based on the fact that a 
magnetic field moving relative to a coil 
produces an electrical current in the coil. 

■ When the surface of a disk passes under the
reading head, it generates a current of the same
polarity (direction) as the one already
recorded.

■ As seen, the idea of reading is similar to that of
writing, which is why some systems have a single
read/write head (floppy disk systems and old
hard disk systems).

■ Newer systems use a magnetoresistive (MR) head for
reading.
Magnetic Disks Internal Structure

[Figure: internal structure of a magnetic disk drive]




Magnetic Disks 

■ Data can be organized on the disk by two
methods:

1- constant angular velocity

■ The disk is divided into a number of pie-shaped
sectors.

■ Each track is divided into an equal number of
sectors.

■ Thus, the sectors on the inner tracks are very
small compared to those on the outer tracks.

■ Advantage: ease of reaching the data.

■ Disadvantage: the amount of data stored on the
long outer tracks is only the same as what can be
stored on the short inner tracks.

■ The density of data on a track is limited by the
inner sector size.


[Figure: (a) constant angular velocity; (b) multiple zoned recording]




2- multiple zoned recording 

■ Used to increase the track storage density. 

■ The disk surface is divided into zones 
(typically 16). 

■ Within each zone, the number of bits per
track is constant.

■ Zones near the center have fewer bits (fewer
sectors) than the outer zones.

■ Advantage: the overall capacity is higher.

■ Disadvantage: the circuitry used for reading
and writing is more complex.

■ We need a means to locate sector positions
within the track, so we need to identify the
start and end of each sector.



[Figure: disk track format]

[Figure: hard disk with multiple heads (upper and lower)]

[Figure: data organization and formatting, showing sectors and tracks]


Time to read/write from a disk 


■ Total time = Ts + transfer time, where Ts is the
average seek time.

■ Transfer time = b / (r × N), where b is the number of
bytes to be transferred, N is the number of bytes
in a track, and r is the rotation speed in revolutions per second.
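As a worked illustration of the timing formula above, the sketch below evaluates it for a made-up drive. All the numbers (4 ms seek, 7200 rpm, 500 sectors of 512 bytes per track) are assumptions chosen for the example, not figures from the slides, and the function names are hypothetical.

```python
def transfer_time(b, n, r):
    """Seconds to transfer b bytes from a track of n bytes spinning at r rev/s."""
    return b / (r * n)


def total_time(ts, b, n, r):
    """Average seek time plus transfer time, both in seconds."""
    return ts + transfer_time(b, n, r)


# Hypothetical drive parameters (illustration only).
seek = 0.004                  # 4 ms average seek time
r = 7200 / 60                 # 7200 rpm expressed in revolutions per second
n = 500 * 512                 # bytes per track: 500 sectors of 512 bytes
b = 512                       # read a single sector

print(f"transfer time: {transfer_time(b, n, r) * 1e3:.4f} ms")
print(f"total time:    {total_time(seek, b, n, r) * 1e3:.4f} ms")
```

With numbers like these the seek time dominates, which is why auxiliary storage is read and written in whole records or blocks rather than single words.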


2- Magnetic Tape 


■ The tape is a strip of plastic coated 
with a magnetic recording medium. 

■ Bits are recorded as magnetic spots 
on the tape along several parallel 
tracks. 

■ A tape unit is addressed by
specifying the record number and
the number of characters in the
record (sequential access).

■ Records may be of fixed or variable 
length. 




■ It is analogous to a tape recorder system.

■ Old tapes had 9 tracks.

■ This enabled storing one byte (8 bits) in
parallel, with the ninth bit as a parity bit.

■ Later, tapes had 18 or 36 tracks,
corresponding to the digital word size.

■ This type of recording is called parallel
recording.
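To make the 9-track layout concrete, the sketch below builds the nine bits that would be written side by side for one byte. The slides do not say whether the parity is even or odd, so the `even` parameter and the function name `nine_track_frame` are assumptions made for illustration.

```python
def nine_track_frame(byte, even=True):
    """Return 9 bits recorded in parallel: 8 data bits followed by a parity bit."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in the range 0..255")
    data_bits = [(byte >> i) & 1 for i in range(7, -1, -1)]  # most significant bit first
    ones = sum(data_bits)
    # Even parity makes the total number of 1s even; odd parity makes it odd.
    parity = ones % 2 if even else (ones + 1) % 2
    return data_bits + [parity]


print(nine_track_frame(0b10110000))
print(nine_track_frame(0b10110000, even=False))
```

A reader can detect any single-bit error in a frame by recomputing the parity over all nine tracks.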



Most modern systems use serial recording. 

Data are laid out as a sequence of bits 
along each track, as done with magnetic 
disks. 

Information is recorded and read in 
contiguous blocks referred to as records. 

Blocks are separated by gaps called inter- 
record gaps. 

The tape is formatted to assist in locating
records.

The recording technique used in serial tapes
is referred to as serpentine recording.



The data is recorded along the whole
length of the tape.

When the end of the tape is reached,
the heads are repositioned to record a
new track, but this time the recording is
in the opposite direction.

To increase speed, the read-write head
is capable of reading and writing a number
of adjacent tracks simultaneously (2-8
tracks).

Data are still recorded serially along
individual tracks, but blocks in
sequence are stored on adjacent
tracks.


Optical memory 

■ The compact disk (CD) was introduced in 1983.

■ It is a nonerasable disk that can store more than 60
minutes of audio information on one side.

CD-ROM (Read only):

■ CD and CD-ROM use similar
technology.

■ The disk is made from polycarbonate.

■ The digital data is imprinted as a series
of microscopic pits on the surface of
the polycarbonate.

■ The master disk is imprinted with a
finely focused, high-intensity laser.



CD-ROM 

The master is used to make a die to stamp
out copies onto polycarbonate.

The pitted surface is then coated with a 
highly reflective surface, usually aluminum . 

This shiny surface is protected against dust 
and scratches by a top coat of clear acrylic . 

Finally, a label can be silkscreened onto 
the acrylic. 
Reading data from CD

The data is read using a low-powered laser
housed in an optical disk player.

The laser shines through clear 
polycarbonate while a motor spins the 
disk. 

The intensity of the reflected laser
light changes as it encounters a pit.

The beginning or end of a pit represents
a 1; when no change in elevation occurs
between intervals, a 0 is recorded.

The disk has a single spiral track , 
beginning near the center and spiraling 
out to the outer edge. 

Sectors near the outside are the same
size as those near the inside.
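The rule that a pit edge reads as 1 and an unchanged surface reads as 0 can be sketched in a few lines. This is a simplified model: the string encoding ('P' for pit, 'L' for land, one character per interval) and the function name `read_bits` are assumptions for the example, and the additional channel coding used on real discs is ignored.

```python
def read_bits(elevations):
    """Return one bit per interval boundary: 1 where the elevation changes, else 0."""
    return [1 if current != previous else 0
            for previous, current in zip(elevations, elevations[1:])]


surface = "LLPPPLL"   # land, land, a pit three intervals long, land, land
print(read_bits(surface))
```

Only the transitions carry information, so a long pit and a long land both read as runs of 0s.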


CD Recordable 


■ Also called a write-once read-many CD.

■ Used in applications in which only
one copy or a small number of copies of a set of data is
needed.

■ The disk is prepared so that it can be written
with a modest-intensity laser beam.

■ The disk medium includes a dye
layer.

■ The dye is used to change reflectivity
by making pits with different
reflectivity on the surface of the
medium.


CD Rewritable

■ Can be written and overwritten repeatedly.

■ The approach used in this type is called
phase change.

■ It uses a material with two significantly different
reflectivities in two different phase states, called the
amorphous state and the crystalline state.

■ A laser beam can change the material
from one phase to the other.

■ Disadvantage:

■ The material loses its desirable
properties over time with use (500,000 to
1,000,000 erase cycles).
Digital Versatile Disk (DVD)

■ It replaces video cassettes, video
tapes, and CD-ROMs.

■ It has huge storage capacity due to three
differences from the CD:

1- Bits are packed more closely on a
DVD. (1/2 CD)

2- It has a second layer of pits and lands
on top of the first layer (dual-layer).
It has a semi-reflective layer
on top of the reflective layer. By
adjusting focus, the lasers in DVD
drives can read each layer
separately. (2 CD)

3- It has two sides, whereas a CD has one
side.


[Figure: (a) CD-ROM: the laser focuses on polycarbonate pits in front of the reflective layer. (b) DVD-ROM, double-sided, dual-layer, capacity 17 GB: the laser focuses on pits in one layer on one side at a time, and the disk must be flipped to read the other side. The layers shown include the protective layer, the polycarbonate substrate, and the reflective and semi-reflective layers.]


Cache Memory 

■ Cache memory is intended to give memory 
speed approaching that of the fastest memories 
available. 

■ It is made of SRAM which is faster than DRAM. 

■ The cache contains a copy of portions of main 
memory . 

■ When the processor attempts to read a word of 
memory, a check is made to determine if the 
word is in the cache. 




■ If so, the word is delivered to the processor. 

■ If not, a block of main memory , consisting of 
some fixed number of words, is read into 
cache and then the word is delivered to the 
processor. 

■ Why do we get blocks of data?
Because of the phenomenon of locality of
reference.

■ When a block of data is fetched into the cache 
to satisfy a single memory reference, it is 
likely that there will be future reference to that 
same memory location or to other words in 
the block. 
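The effect of fetching whole blocks can be seen in a tiny simulation. This is a sketch under simplifying assumptions: the cache is modelled as an unbounded set of block numbers, a block holds 4 words, and the name `read_word` is hypothetical.

```python
def read_word(address, cache, block_words=4):
    """Return True on a cache hit; on a miss, load the whole block and return False."""
    block = address // block_words   # block that contains this word
    if block in cache:
        return True
    cache.add(block)                 # the block is read into the cache
    return False


cache = set()
results = [read_word(address, cache) for address in range(8)]
print(results)
print("hit ratio:", sum(results) / len(results))
```

For sequential addresses, each miss brings in the next three words as well, so only the first reference to each block misses.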
■ Each line in the cache contains a block and a tag.

■ At any time, some subset of the blocks of memory 
resides in lines in the cache . 

■ If a word in a block of memory is read, that block 
is transferred to one of the lines of the cache. 

■ Because there are more blocks than lines, an 
individual line cannot be uniquely and 
permanently dedicated to a particular block. 

■ Thus, each line includes a tag that identifies which 
particular block is currently being stored . 

■ The tag is usually a portion of the main memory 
address . 

[Figure: the cache as numbered lines, each holding a tag and a block, alongside main memory organized by memory address]
Mapping Function



Because there are fewer cache lines than main memory 
blocks, an algorithm is needed for mapping (allocating) 
main memory blocks into cache lines. 

Further, a means is needed for determining which main 
memory block currently occupies a cache line . 

The choice of the mapping function dictates how the 
cache is organized. 
1- Direct mapping

The simplest technique.

Each block of main memory is
mapped into only one possible cache
line.
Example

■ The cache can hold 64 Kbytes.

■ Data is transferred between main memory
and the cache in blocks of 4 bytes each.

■ This means that the cache is organized as
16K lines of 4 bytes each.

■ The main memory consists of 16 Mbytes,
with each byte directly addressable by a 24-bit
address (2^24 = 16M).

■ Thus, for mapping purposes, we can
consider main memory to consist of 4M
blocks of 4 bytes each.
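With the sizes in this example, a 24-bit address can be split into the fields a direct-mapped cache would use. The sketch assumes the conventional rule that a block goes to line (block number mod number of lines); the field widths are derived from the sizes given in the example, and the names below are chosen for illustration.

```python
CACHE_BYTES = 64 * 1024        # 64 Kbytes of cache
BLOCK_BYTES = 4                # 4-byte blocks
ADDRESS_BITS = 24              # 16 Mbytes of byte-addressable main memory

NUM_LINES = CACHE_BYTES // BLOCK_BYTES             # 16K lines
WORD_BITS = BLOCK_BYTES.bit_length() - 1           # bits selecting a byte in the block
LINE_BITS = NUM_LINES.bit_length() - 1             # bits selecting the cache line
TAG_BITS = ADDRESS_BITS - LINE_BITS - WORD_BITS    # remaining high-order bits


def split_address(address):
    """Split a main memory address into (tag, line, word) fields."""
    word = address & (BLOCK_BYTES - 1)
    line = (address >> WORD_BITS) & (NUM_LINES - 1)
    tag = address >> (WORD_BITS + LINE_BITS)
    return tag, line, word


print("tag/line/word bits:", TAG_BITS, LINE_BITS, WORD_BITS)
tag, line, word = split_address(0x16339C)
print(f"tag={tag:02X} line={line:04X} word={word}")
```

Two addresses whose line fields match but whose tags differ compete for the same cache line, which is the main weakness of direct mapping.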


2- Associative Mapping 

■ Associative mapping overcomes the
disadvantages of direct mapping by
permitting each main memory block to be
loaded into any line of the cache.




3- Set Associative Mapping 

Set associative mapping is a compromise
that exhibits the strengths of both the
direct and associative approaches while
reducing their disadvantages.

In this case, the cache is divided into v
sets, each of which consists of k lines.

This is referred to as k-way set
associative mapping.

With set associative mapping, block j can
be mapped into any of the lines of set i, where i = j mod v.

Set associative mapping with two lines in
each set is referred to as two-way set
associative.
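A short sketch of the set selection follows. It assumes the usual rule that block j belongs to set i = j mod v, and it numbers the lines so that set i occupies lines i*k to i*k + k - 1; that numbering and the function names are assumptions made for the example.

```python
def set_index(block_number, v):
    """Set that a main memory block maps to in a cache with v sets."""
    return block_number % v


def candidate_lines(block_number, v, k):
    """Cache lines that may hold the block in a k-way set associative cache."""
    i = set_index(block_number, v)
    return [i * k + way for way in range(k)]


# Small two-way example: 4 sets of 2 lines each (8 lines in total).
for block in (0, 5, 9, 13):
    print(block, "-> set", set_index(block, 4), "lines", candidate_lines(block, 4, 2))
```

Setting k = 1 gives direct mapping, and setting v = 1 gives fully associative mapping, which is why set associative mapping is described as a compromise between the two.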


Replacement Algorithms 


■ When a new block is brought into the 
cache, one of the existing blocks 
must be replaced. 

■ For direct mapping , there is only one 
possible line for any particular block, 
and no choice is possible. 

■ For the associative and set
associative techniques, a
replacement algorithm is needed.

■ To achieve high speed, such an
algorithm must be implemented in
hardware.


1- First-In-First-Out (FIFO) 


■ Replace that block in the set 
that has been in the cache 
longest. 

■ FIFO is easily implemented. 

■ Hit ratio = probability that a word 
is found in the cache 
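FIFO replacement and the hit ratio can be illustrated with a small software simulation. Real caches implement this in hardware; the sketch below only models the behaviour for one set, and the function name and the reference string are made up for the example.

```python
from collections import deque


def fifo_hit_ratio(references, num_lines):
    """Simulate one set with FIFO replacement and return the hit ratio."""
    if not references:
        return 0.0
    lines = deque()                  # oldest block sits at the left end
    hits = 0
    for block in references:
        if block in lines:
            hits += 1                # a hit does not change the FIFO order
        else:
            if len(lines) == num_lines:
                lines.popleft()      # evict the block that has been in the cache longest
            lines.append(block)
    return hits / len(references)


print(fifo_hit_ratio([1, 2, 3, 1, 4, 1], num_lines=3))
```

In this reference string block 1 is evicted even though it was just used, because FIFO looks only at load order and not at usage.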
2- Least recently used (LRU)
(time factor)

Replace that block in the set that has
been in the cache longest with no
reference to it.

For example, in two-way set
associative, this is easily implemented.

Each line includes a USE bit.

When a line is referenced, its USE bit is set to 1
and the USE bit of the other line in
that set is set to 0.

When a block is to be read into the set,
the line whose USE bit is 0 is used.

Because we are assuming that more
recently used memory locations are
more likely to be referenced, LRU
should give the best hit ratio.
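The USE-bit scheme for a two-way set can be sketched as a small class. This is a software model only; the class name `TwoWaySet`, the choice to fill empty lines first, and treating a newly loaded block as referenced are assumptions made for the example.

```python
class TwoWaySet:
    """One set of a two-way set associative cache with LRU via USE bits."""

    def __init__(self):
        self.blocks = [None, None]   # block held by each of the two lines
        self.use = [0, 0]            # USE bit of each line

    def access(self, block):
        """Reference a block; return True on a hit and False on a miss."""
        if block in self.blocks:
            line = self.blocks.index(block)
            hit = True
        else:
            # Fill an empty line first; otherwise replace the line whose USE bit is 0.
            line = self.blocks.index(None) if None in self.blocks else self.use.index(0)
            self.blocks[line] = block
            hit = False
        self.use[line] = 1           # the referenced line becomes most recently used
        self.use[1 - line] = 0       # the other line becomes the replacement candidate
        return hit


s = TwoWaySet()
for b in ["A", "B", "A", "C", "B"]:
    print(b, "hit" if s.access(b) else "miss", s.blocks)
```

With only two lines, a single bit per line is enough to record which one was used more recently.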


3- Least Frequently Used (LFU)
(number of uses)

■ Replace that block in the set 
that has experienced the fewest 
references. 

■ LFU could be implemented by 
associating a counter with each 
line. 
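A counter per line is enough to model LFU. In the sketch below, the class name `LfuSet`, breaking ties by lowest line number, and resetting the counter to 1 when a new block is loaded are assumptions made for illustration.

```python
class LfuSet:
    """One cache set using least-frequently-used replacement."""

    def __init__(self, num_lines):
        self.blocks = [None] * num_lines
        self.counts = [0] * num_lines     # reference counter associated with each line

    def access(self, block):
        """Reference a block; return True on a hit and False on a miss."""
        if block in self.blocks:
            self.counts[self.blocks.index(block)] += 1
            return True
        # Empty lines have a count of 0, so they are chosen before any block is evicted.
        victim = min(range(len(self.blocks)), key=lambda i: self.counts[i])
        self.blocks[victim] = block
        self.counts[victim] = 1
        return False


s = LfuSet(2)
for b in ["A", "A", "B", "C"]:
    print(b, "hit" if s.access(b) else "miss", s.blocks, s.counts)
```

Block B is replaced rather than A because A has been referenced more often.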




4- Random Replacement

■ Pick a line at random from 
among the candidate lines. 

■ Simulation studies have shown 
that random replacement 
provides only slightly inferior 
performance to an algorithm 
based on usage. 


Write Policy 

An important aspect of cache 
organization is concerned with 
memory write requests. 

When the CPU finds a word in cache 
during a read operation, the main 
memory is not involved in the transfer. 

However, if the operation is a write, 
there are two ways that the system 
can proceed: 

Write-through 

Write-back 


1. Write-through

■ The simplest and most commonly used
procedure.

■ Update main memory with every
memory write operation, with cache
memory being updated in parallel if it
contains the word at the specified
address.

■ This method has the advantage that
main memory always contains the
same data as the cache.

■ This characteristic is important in
systems with direct memory access (DMA)
transfers.




■ It ensures that the data residing 
in main memory are valid at all 
times so that an I/O device 
communicating through DMA 
would receive the most recent 
updated data. 

■ However, it has the 
disadvantage of high memory 
traffic and may create a 
bottleneck. 


2. Write-back 


■ Only the cache location is updated 
during a write operation. 

■ The location is then marked by a flag 
(called update bit or dirty bit ) so that 
later when the word is removed 
(replaced) from the cache, it is copied 
into main memory. 

■ The reason for the write-back method is
that during the time a word resides in
the cache, it may be updated several
times.

■ As long as the word remains
in the cache, it does not matter whether
the copy in main memory is out of date,
since requests for the word are filled
from the cache.




■ It is only when the word is displaced 
from the cache that an accurate copy 
need be rewritten into main memory. 

■ Analytical results indicate that the 
number of memory writes in a typical 
program ranges between 10 and 
30% of the total references to 
memory. 

■ The problem with this technique is that
portions of main memory are invalid, and
thus I/O modules can access this
data only through the cache.

■ This requires complex circuitry
and creates a potential bottleneck.
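The difference in memory traffic between write-through and write-back can be shown by counting main memory writes in a toy model. The sketch assumes a small fully associative cache of single words with FIFO replacement, and it loads a word into the cache on a write miss under both policies; these choices, and the function name `count_memory_writes`, are assumptions for illustration only.

```python
from collections import OrderedDict


def count_memory_writes(operations, num_lines, policy):
    """Count main memory writes for a list of ("read" | "write", address) operations."""
    cache = OrderedDict()            # address -> dirty bit, oldest entry first
    memory_writes = 0
    for op, address in operations:
        if address not in cache:
            if len(cache) == num_lines:
                _, dirty = cache.popitem(last=False)   # displace the oldest word
                if dirty:
                    memory_writes += 1                 # write-back: copy it to main memory
            cache[address] = False
        if op == "write":
            if policy == "write-through":
                memory_writes += 1                     # main memory updated on every write
            else:
                cache[address] = True                  # write-back: only set the dirty bit
    return memory_writes


ops = [("write", 0), ("write", 0), ("write", 0), ("read", 1), ("read", 2)]
print("write-through:", count_memory_writes(ops, 2, "write-through"))
print("write-back:   ", count_memory_writes(ops, 2, "write-back"))
```

Words that are still dirty in the cache at the end are not counted, since they have not yet been displaced.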


Multiple caches 

A computer now has multiple caches.

A cache can be placed on the same
chip as the processor: the on-chip
cache (L1).

It reduces the processor's external
bus activity and therefore speeds up
execution times and increases
overall performance.

This is combined with the external
cache (L2), giving a two-level cache
organization.

This complicates the circuit and
requires replacement algorithms and
write policies.


Unified versus split caches 

■ We can also have two caches: one for 
holding the instructions and the other for 
holding the data. 

Advantage of unified cache: 

■ It has a higher hit rate than a split cache because it
balances the load between instruction
and data fetches. If the execution
requires more instruction fetches, more
of the cache is used for instructions, and
if the execution requires more data
fetches, more of the cache is used for
data. A split cache, by contrast, can end up
with its instruction cache full while the data cache is
largely free, or vice versa.

■ Only one cache needs to be designed and
implemented.




■ Despite that, most modern systems,
especially parallel systems, tend
to have split caches.

■ A split cache has the advantage of eliminating
contention for the cache between
the fetch/decode unit and the
execution unit.

■ Thus, the two operations can be
carried out in parallel, at the same time,
on the two caches.
Pentium 4 Cache